SEQUENCE LISTING

(1) GENERAL INFORMATION:

(i) APPLICANT: Reyes, Gregory R. 

Yarbough, Patrice O
Bradley, Daniel W 
Krawczynski, Krzysztof Z 
Tam, Albert
Fry, Kirk E 

(ii) TITLE OF INVENTION: DNA Sequences of Enterically Transmitted 
Non-A/Non-B Hepatitis Viral Agent 

(iii) NUMBER OF SEQUENCES: 20 

(iv) CORRESPONDENCE ADDRESS: 

(A) ADDRESSEE: Dehlinger & Associates 

(B) STREET: 350 Cambridge Avenue, Suite 250 

(C) CITY: Palo Alto 
(D) STATE: CA

(E) COUNTRY: USA 

(F) ZIP: 94306

(v) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Floppy disk 

(B) COMPUTER: IBM PC compatible 

(C) OPERATING SYSTEM: PC-DOS/MS-DOS 

(D) SOFTWARE: PatentIn Release #1.0, Version #1.25

(vi) CURRENT APPLICATION DATA: 

(A) APPLICATION NUMBER: US 09/128,275 

(B) FILING DATE: 03-AUG-1998 

(vii) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: US 08/279,823 

(B) FILING DATE: 25-JUL-1994 

(vii) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: US 07/681,078 

(B) FILING DATE: 05-APR-1991 

(vii) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: US 07/505,888 

(B) FILING DATE: 05-APR-1990 

(vii) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: US 07/420,921 

(B) FILING DATE: 13-OCT-1989 

(vii) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: US 07/367,486 

(B) FILING DATE: 16-JUN-1989 

(vii) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: US 07/336,672 

(B) FILING DATE: 11-APR-1989

(vii) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: US 07/208,997 

(B) FILING DATE: 17-JUN-1988

(viii) ATTORNEY/AGENT INFORMATION:

(A) NAME: Petithory, Joanne R. 

(B) REGISTRATION NUMBER: 42,995 

(C) REFERENCE/DOCKET NUMBER: 4600-0183.24

(ix) TELECOMMUNICATION INFORMATION: 

(A) TELEPHONE: (650) 324-0880 

(B) TELEFAX: (650) 324-0960 

(2) INFORMATION FOR SEQ ID NO: 1:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1295 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO 

(vi) ORIGINAL SOURCE: 

(C) INDIVIDUAL ISOLATE: 1.33 kb EcoRI insert of ET1.1,
forward sequence 

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 1..1293 

(ix) FEATURE: 

(A) NAME/KEY: CDS

(B) LOCATION: 2..1294

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 3..1295

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:

AGACCTGTCC CTGTTGCAGC TGTTCTACCA CCCTGCCCCG AGCTCGAACA GGGCCTTCTC 60 

TACCTGCCCC AGGAGCTCAC CACCTGTGAT AGTGTCGTAA CATTTGAATT AACAGACATT 120 

GTGCACTGCC GCATGGCCGC CCCGAGCCAG CGCAAGGCCG TGCTGTCCAC ACTCGTGGGC 180 

CGCTACGGCG GTCGCACAAA GCTCTACAAT GCTTCCCACT CTGATGTTCG CGACTCTCTC 240 

GCCCGTTTTA TCCCGGCCAT TGGCCCCGTA CAGGTTACAA CTTGTGAATT GTACGAGCTA 300 

GTGGAGGCCA TGGTCGAGAA GGGCCAGGAT GGCTCCGCCG TCCTTGAGCT TGATCTTTGC 360 

AACCGTGACG TGTCCAGGAT CACCTTCTTC CAGAAAGATT GTAACAAGTT CACCACAGGT 420 

GAGACCATTG CCCATGGTAA AGTGGGCCAG GGCATCTCGG CCTGGAGCAA GACCTTCTGC 480 

GCCCTCTTTG GCCCTTGGTT CCGCGCTATT GAGAAGGCTA TTCTGGCCCT GCTCCCTCAG 540 

GGTGTGTTTT ACGGTGATGC CTTTGATGAC ACCGTCTTCT CGGCGGCTGT GGCCGCAGCA 600

AAGGCATCCA TGGTGTTTGA GAATGACTTT TCTGAGTTTG ACTCCACCCA GAATAACTTT 660

TCTCTGGGTC TAGAGTGTGC TATTATGGAG GAGTGTGGGA TGCCGCAGTG GCTCATCCGC 720

CTGTATCACC TTATAAGGTC TGCGTGGATC TTGCAGGCCC CGAAGGAGTC TCTGCGAGGG 780

TTTTGGAAGA AACACTCCGG TGAGCCCGGC ACTCTTCTAT GGAATACTGT CTGGAATATG 840

GCCGTTATTA CCCACTGTTA TGACTTCCGC GATTTTCAGG TGGCTGCCTT TAAAGGTGAT 900

GATTCGATAG TGCTTTGCAG TGAGTATCGT CAGAGTCCAG GAGCTGCTGT CCTGATCGCC 960

GGCTGTGGCT TGAAGTTGAA GGTAGATTTC CGCCCGATCG GTTTGTATGC AGGTGTTGTG 1020

GTGGCCCCCG GCCTTGGCGC GCTCCCTGAT GTTGTGCGCT TCGCCGGCCG GCTTACCGAG 1080

AAGAATTGGG GCCCTGGCCC TGAGCGGGCG GAGCAGCTCC GCCTCGCTGT TAGTGATTTC 1140

CTCCGCAAGC TCACGAATGT AGCTCAGATG TGTGTGGATG TTGTTTCCCG TGTTTATGGG 1200

GTTTCCCCTG GACTCGTTCA TAACCTGATT GGCATGCTAC AGGCTGTTGC TGATGGCAAG 1260

GCACATTTCA CTGAGTCAGT AAAACCAGTG CTCGA 1295



(2) INFORMATION FOR SEQ ID NO: 2: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 431 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2: 

Arg Pro Val Pro Val Ala Ala Val Leu Pro Pro Cys Pro Glu Leu Glu
1 5 10 15

Gln Gly Leu Leu Tyr Leu Pro Gln Glu Leu Thr Thr Cys Asp Ser Val
20 25 30

Val Thr Phe Glu Leu Thr Asp Ile Val His Cys Arg Met Ala Ala Pro
35 40 45

Ser Gln Arg Lys Ala Val Leu Ser Thr Leu Val Gly Arg Tyr Gly Gly
50 55 60

Arg Thr Lys Leu Tyr Asn Ala Ser His Ser Asp Val Arg Asp Ser Leu
65 70 75 80

Ala Arg Phe Ile Pro Ala Ile Gly Pro Val Gln Val Thr Thr Cys Glu
85 90 95

Leu Tyr Glu Leu Val Glu Ala Met Val Glu Lys Gly Gln Asp Gly Ser
100 105 110

Ala Val Leu Glu Leu Asp Leu Cys Asn Arg Asp Val Ser Arg Ile Thr
115 120 125

Phe Phe Gln Lys Asp Cys Asn Lys Phe Thr Thr Gly Glu Thr Ile Ala
130 135 140

His Gly Lys Val Gly Gln Gly Ile Ser Ala Trp Ser Lys Thr Phe Cys
145 150 155 160

Ala Leu Phe Gly Pro Trp Phe Arg Ala Ile Glu Lys Ala Ile Leu Ala
165 170 175

Leu Leu Pro Gln Gly Val Phe Tyr Gly Asp Ala Phe Asp Asp Thr Val
180 185 190

Phe Ser Ala Ala Val Ala Ala Ala Lys Ala Ser Met Val Phe Glu Asn
195 200 205

Asp Phe Ser Glu Phe Asp Ser Thr Gln Asn Asn Phe Ser Leu Gly Leu
210 215 220

Glu Cys Ala Ile Met Glu Glu Cys Gly Met Pro Gln Trp Leu Ile Arg
225 230 235 240

Leu Tyr His Leu Ile Arg Ser Ala Trp Ile Leu Gln Ala Pro Lys Glu
245 250 255

Ser Leu Arg Gly Phe Trp Lys Lys His Ser Gly Glu Pro Gly Thr Leu
260 265 270

Leu Trp Asn Thr Val Trp Asn Met Ala Val Ile Thr His Cys Tyr Asp
275 280 285

Phe Arg Asp Phe Gln Val Ala Ala Phe Lys Gly Asp Asp Ser Ile Val
290 295 300

Leu Cys Ser Glu Tyr Arg Gln Ser Pro Gly Ala Ala Val Leu Ile Ala
305 310 315 320

Gly Cys Gly Leu Lys Leu Lys Val Asp Phe Arg Pro Ile Gly Leu Tyr
325 330 335

Ala Gly Val Val Val Ala Pro Gly Leu Gly Ala Leu Pro Asp Val Val
340 345 350

Arg Phe Ala Gly Arg Leu Thr Glu Lys Asn Trp Gly Pro Gly Pro Glu
355 360 365

Arg Ala Glu Gln Leu Arg Leu Ala Val Ser Asp Phe Leu Arg Lys Leu
370 375 380

Thr Asn Val Ala Gln Met Cys Val Asp Val Val Ser Arg Val Tyr Gly
385 390 395 400

Val Ser Pro Gly Leu Val His Asn Leu Ile Gly Met Leu Gln Ala Val
405 410 415

Ala Asp Gly Lys Ala His Phe Thr Glu Ser Val Lys Pro Val Leu
420 425 430



(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO 

(vi) ORIGINAL SOURCE: 

(C) INDIVIDUAL ISOLATE: linker - top (5') sequence

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3: 

GGAATTCGCG GCCGCTCG 

(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO 

(vi) ORIGINAL SOURCE: 

(C) INDIVIDUAL ISOLATE: linker - bottom (3') sequence

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4: 

CGAGCGGCCG CGAATTCCTT 

(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1295 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 
(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO 

(vi) ORIGINAL SOURCE: 

(C) INDIVIDUAL ISOLATE: 1.33 kb EcoRI insert of ET1.1, 
reverse sequence 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5: 
TCGAGCACTG GTTTTACTGA CTCAGTGAAA TGTGCCTTGC CATCAGCAAC AGCCTGTAGC 60

ATGCCAATCA GGTTATGAAC GAGTCCAGGG GAAACCCCAT AAACACGGGA AACAACATCC 120

ACACACATCT GAGCTACATT CGTGAGCTTG CGGAGGAAAT CACTAACAGC GAGGCGGAGC 180

TGCTCCGCCC GCTCAGGGCC AGGGCCCCAA TTCTTCTCGG TAAGCCGGCC GGCGAAGCGC 240

ACAACATCAG GGAGCGCGCC AAGGCCGGGG GCCACCACAA CACCTGCATA CAAACCGATC 300

GGGCGGAAAT CTACCTTCAA CTTCAAGCCA CAGCCGGCGA TCAGGACAGC AGCTCCTGGA 360

CTCTGACGAT ACTCACTGCA AAGCACTATC GAATCATCAC CTTTAAAGGC AGCCACCTGA 420

AAATCGCGGA AGTCATAACA GTGGGTAATA ACGGCCATAT TCCAGACAGT ATTCCATAGA 480

AGAGTGCCGG GCTCACCGGA GTGTTTCTTC CAAAACCCTC GCAGAGACTC CTTCGGGGCC 540

TGCAAGATCC ACGCAGACCT TATAAGGTGA TACAGGCGGA TGAGCCACTG CGGCATCCCA 600

CACTCCTCCA TAATAGCACA CTCTAGACCC AGAGAAAAGT TATTCTGGGT GGAGTCAAAC 660

TCAGAAAAGT CATTCTCAAA CACCATGGAT GCCTTTGCTG CGGCCACAGC CGCCGAGAAG 720

ACGGTGTCAT CAAAGGCATC ACCGTAAAAC ACACCCTGAG GGAGCAGGGC CAGAATAGCC 780

TTCTCAATAG CGCGGAACCA AGGGCCAAAG AGGGCGCAGA AGGTCTTGCT CCAGGCCGAG 840

ATGCCCTGGC CCACTTTACC ATGGGCAATG GTCTCACCTG TGGTGAACTT GTTACAATCT 900

TTCTGGAAGA AGGTGATCCT GGACACGTCA CGGTTGCAAA GATCAAGCTC AAGGACGGCG 960

GAGCCATCCT GGCCCTTCTC GACCATGGCC TCCACTAGCT CGTACAATTC ACAAGTTGTA 1020

ACCTGTACGG GGCCAATGGC CGGGATAAAA CGGGCGAGAG AGTCGCGAAC ATCAGAGTGG 1080

GAAGCATTGT AGAGCTTTGT GCGACCGCCG TAGCGGCCCA CGAGTGTGGA CAGCACGGCC 1140

TTGCGCTGGC TCGGGGCGGC CATGCGGCAG TGCACAATGT CTGTTAATTC AAATGTTACG 1200

ACACTATCAC AGGTGGTGAG CTCCTGGGGC AGGTAGAGAA GGCCCTGTTC GAGCTCGGGG 1260

CAGGGTGGTA GAACAGCTGC AACAGGGACA GGTCT 1295

(2) INFORMATION FOR SEQ ID NO: 6:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7195 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO 

(vi) ORIGINAL SOURCE: 

(C) INDIVIDUAL ISOLATE: HEV - Burma strain 

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 28..5106

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 5147..7126

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 5106..5474



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6: 





\^r\ inivji VJO 1 


PHATHPPATP, 


nAP,P.PPPATP 


nu 1 1 1A1 1AA 


GGG1GG1GGG 




ATPAPTAPTft 


PTATTHACPA 


bbL 1 ± U J. A 




AG 1 G 1 GGGG X 


GGCGAATGCT 








p "t* p t p n pp n p 


p7}p T A r P rn P T AP7\ 
GAGA I 1GAGA 


I GG 1 GA 1 1 AA 


CCTAATGCAA 


180 


LL 1 GGGGAGG 


rn rp f* rn rn rp m /™i ^» 

1 iGl 1 I 1GGG 


GGGGGAGG 1 1 


I 1 G 1 GGAA 1 G 


ATCCCATCCA 


GCGTGTCATC 


240 


G A 1 AAL G AG G 


1 GGAGG 1 I 1 A 


G 1 GGGGGGGG 


GGGIGCGGCG 


GCTGTCTTGA 


AATTGGCGCC 


300 


GAlGGGGGGi 


r~* 7\ 7\ rp 7\ TV 7\ rn t\ 

GAA1 AAAI GA 


TAATCCTAAT 


GTGGTCCACC 


GCTGCTTCCT 


CCGCCCTGTT 


360 


GGGGGT.GATG 


TTCAGCGCTG 


GTATACTGCT 


CCCACTCGCG 


GGCCGGCTGC 


TAATTGCCGG 


-420 


GG1 1GGGGGG 


TGCGCGGGCT 


TCCCGCTGCT 


GACCGCACTT 


ACTGCCTCGA 


CGGGTTTTCT 


' 480 


GGG1 G1AAG1 


TTCCCGCCGA 


GACTGGCATC 


GCCCTCTACT 


CCCTTCATGA 


TATGTCACCA 


540 


J.G1GA1G1GG 


CCGAGGCCAT 


GTTCCGCCAT 


GGTATGACGC 


GGCTCTATGC 


CGCCCTCCAT 


600 


LI 1GGGGG JLG 


AGGTCCTGCT 


GCCCCCTGGC 


AGATATCGCA 


CCGCATCGTA 


TTTGCTAATT 


660 


p zi t p n pp c t z\ 


GGCGCGTTGT 


GGTGACGTAT 


GAGGG 1 GA 1 A 


G 1 AG i GG 1 GG 


mm 7\ f*** TV TV /~yf~\ Ti /~i 

TTAGAACCAC 


720 




ACTTGCGCTC 


CTGGATTAGA 


AGGAGGAAGG 


i 1 AGGGGAGA 


GGATCCCCTC 


780 




GGGTTAGGGC 


CATTGGCTGC 




ILi 1 GG I GAG 


GGGAGGGGGG 


840 


gap»ppatpap 


CTATGCCTTA 


TGTTCCTTAC 


PPPPP.P.TPTA 


pp p zi rr t r* t n 

LUuAuo iHA 


ILjI GGGA1GG 


yuu 


ATCTTCGGCC 


CGGGTGGCAC 


CCCTTCCTTA 


TTPPPAAPPT 


PATf^PTPPAP 


TaZi^TPPZiPP 
i AAo 1 L (jAUL 




TTCCATGCTG 


TCCCTGCCCA TATTTGGGAC 


CGTPTTATP.P 




PAPPTTP.PAT 




GACCAAGCCT TTTGCTGCTC CCGTTTAATG ACCTACCTTC GCGGCATTAG CTACAAGGTC 1080

ACTGTTGGTA CCCTTGTGGC TAATGAAGGC TGGAATGCCT CTGAGGACGC CCTCACAGCT 1140

GTTATCACTG CCGCCTACCT TACCATTTGC CACCAGCGGT ATCTCCGCAC CCAGGCTATA 1200

TCCAAGGGGA TGCGTCGTCT GGAACGGGAG CATGCCCAGA AGTTTATAAC ACGCCTCTAC 1260

AGCTGGCTCT TCGAGAAGTC CGGCCGTGAT TACATCCCTG GCCGTCAGTT GGAGTTCTAC 1320

GCCCAGTGCA GGCGCTGGCT CTCCGCCGGC TTTCATCTTG ATCCACGGGT GTTGGTTTTT 1380

GACGAGTCGG CCCCCTGCCA TTGTAGGACC GCGATCCGTA AGGCGCTCTC AAAGTTTTGC 1440

TGCTTCATGA AGTGGCTTGG TCAGGAGTGC ACCTGCTTCC TTCAGCCTGC AGAAGGCGCC 1500

GTCGGCGACC AGGGTCATGA TAATGAAGCC TATGAGGGGT CCGATGTTGA CCCTGCTGAG 1560

TCCGCCATTA GTGACATATC TGGGTCCTAT GTCGTCCCTG GCACTGCCCT CCAACCGCTC 1620

TACCAGGCCC TCGATCTCCC CGCTGAGATT GTGGCTCGCG CGGGCCGGCT GACCGCCACA 1680

GTAAAGGTCT CCCAGGTCGA TGGGCGGATC GATTGCGAGA CCCTTCTTGG TAACAAAACC 1740

TTTCGCACGT CGTTCGTTGA CGGGGCGGTC TTAGAGACCA ATGGCCCAGA GCGCCACAAT 1800

CTCTCCTTCG ATGCCAGTCA GAGCACTATG GCCGCTGGCC CTTTCAGTCT CACCTATGCC 1860


GCCTCTGCAG 


CTGGGCTGGA 


GGTGCGCTAT 

oo X wvuv>< x n x 


GTTGCTGCCG 


GGCTTGACCA 


TPGGGPGGTT 




TTTGPPCPPG 


GTGTTTPAPP 

O X O XXX \^rWs\*+ 


PPGGTPAGPP 


CCCGGCGAGG 


TTACCGCCTT 


PTHPTPTGPP 


1 QftO 


PTATAPAGGT 


TTAAPPHTGA 

X X riTiv \»/ VJ7 X VJii 


GGPPPAGPGP 


CATTCGCTGA 


TCGGTAACTT 


ATPPTTPPAT 
n x o o x x v_» v^n x 




ppTHAnn^AP 


TPATTPPIPPT 
x i x oov^Vw/ x 


X X V^UV^^V^yU 


TTTTCGCCCG 


GGCATGTTTG 


PPAPTPPPPT 
oono i V— oov,* l 


^ X U \J 


AATPPATTPT 

nn X ^/^n X X X 


GTftGPGAGAP 


PAPAPTTTAP 
v^nunu xxx 


ACCCGTACTT 


GGTCGGAGGT 


TPATPPPPTP 
x on i o\-^\_/0 1 *w/ 


91 fin 


TPTAftTPPAH 
no x v^v^no 


PPPPPPPTPA 


PTT APPTTTT 
l nOO llll 


ATGTCTGAGC 


CTTCTATACC 


TAPTAPPPPP 
1 no 1 noooV»,V^ 


9990 


nppaPHPPTA 
o^A^n\j.O\j>^ X n 




v^v_»v^ 1 V-/ 1 n^v^k^ 


CCCCCTGCAC 


CGGACCCTTC 






tpt^pppppp 


P CZ P T T P P T P ZX 


PPPPPPTTPT 


GGCGCTACCG 


CCGGGGCCCC 


PPPP Z1TZ1 Zir"T 
bov^n 1 nnL 1 


Z Jfi KJ 


^nonv^ o o 




k^o^v^ 1 o^ 1 


TTCACCTACC 


CGGATGGCTC 






o^-^oo^ 1 v^o*— 


loll V^ono 1 \s 


papatppapp 


TGGCTCGTTA 


ACGCGTCTAA 


TPTTPBPPaP 
loll bAOtrtL 




k^O^— 1 OOV^-O 


U^OOO^ i 1 lO 


pp a TPP a TTT 


TACCAAAGGT 


ACCCCGCCTC 


PTTTPBTPPT ■ 
tlil on 1 1 


9CL9A 


O^W 1^1 1 1 1 O 


1 on 1 utuuun 




GCGTACACAC 


TAACCCCCCG 


oL^nnlnnl 1 




pAPr;PTHTPf^ 


PPPPTPATTA 
^V_/^V^ioni In 


TAPPTTPPAA 
1 nOo 1 1 ounn 


CATAACCCAA 


AGAGGCTTGA 


PPPTPPTTDT 
oov^lot^l inl 


9^4 0 


rnnnaa aptt 


oV^ 1 LLLu^L 1 


PPPPAPPPPT 


GCATACCCGC 


TCCTCGGGAC 


L.ooCnl/ilnLx 


z / uu 


PAPf^TPPPf^A 
Unou 1 o^^on 


1 v_,oo^_*^^L*.rio 


1111 onOuuv 


TGGGAGCGGA ACCACCGCCC 


Ooooon 1 ono 


97 fin 


1 lui nv— iV*- X X v^y 


PTPAPPTTPP 
X unu'w x X Ob 


TPPPAPATPP 
1 ov^v^noni oo 


TTTGAGGCCA 


ATAGGCCGAC 


PPPPPPf^APT 




PTPAPTATAA 


PTPAPPATHT 

x unvjunl O X 


TPPAPPPAPA 


GCGAATCTGG 


CCATCGAGCT 


T^APTPAfiPP 

X Onv> X unuL>L- 


^ o O 


APAGATGTPG 


GPPPGGPPTG 


TGPPGGPTGT 


CGGGTCACCC 


CCGGCGTTGT 


TPAHTAPPAn 
1 v--no x nv^ono 




TTTACTGCAG 


GTGTGCCTGG 


ATCCGGCAAG 


TCCCGCTCTA 


TCACCCAAGC 


PGATGTGGAP 

Oil X O X (JWl\y 


3000 


GTTGTCGTGG 

\J X X VJ X \^\*J X WW 


TCCCGACGCG 


TGAGTTGCGT 

X X X WWW X 


AATGCCTGGC 


GCCGTCGCGG 


PTTTGPTGPT 

V^r XXX Ou X VJ\y X 


3060 


TTTACCCCGC ATACTGCCGC CAGAGTCACC CAGGGGCGCC GGGTTGTCAT TGATGAGGCT 3120

CCATCCCTCC CCCCTCACCT GCTGCTGCTC CACATGCAGC GGGCCGCCAC CGTCCACCTT 3180

CTTGGCGACC CGAACCAGAT CCCAGCCATC GACTTTGAGC ACGCTGGGCT CGTCCCCGCC 3240

ATCAGGCCCG ACTTAGGCCC CACCTCCTGG TGGCATGTTA CCCATCGCTG GCCTGCGGAT 3300

GTATGCGAGC TCATCCGTGG TGCATACCCC ATGATCCAGA CCACTAGCCG GGTTCTCCGT 3360

TCGTTGTTCT GGGGTGAGCC TGCCGTCGGG CAGAAACTAG TGTTCACCCA GGCGGCCAAG 3420


CCCGCCAACC


CCGGCTCAGT 


GACGGTCCAC 


GAGGCGCAGG 


GCGCTACCTA 


PAPGGAGAPP 




AC TAT TAT TG 


CCACAGCAGA 


TGCCCGGGGC 


CTTATTCAGT 

^* X X X X \~s£±\J X 


CGTCTCGGGC 


TPATGPPATT 

x x Ou x x 




GTTGCTCTGA 


CGCGCCACAC 


TGAGAAGTGC 


GTCATCATTG 


ACGCACCAGG 


CCTGCTTCGC 


3600 


GAGGTGGGCA 


TCTCCGATGC 


AATCGTTAAT 

* * X v \J X X 4Ml X 


AACTTTTTCC 

ruiv^ x x x x x 


TCGCTGGTGG 

X v^O X OO X oo 


PGAAATTGGT 

vunnn x x oo x 


JUUU 


CACCAGCGCC 

V*** lv V^* 1\J N«/ 


CATCAGTTAT 


TCCCCGTGGC 


AACPPTGAPG 

nriu^ x o«bo 


PPAATGTTGA 
v^v^riri x o x x OxA 


PAPPPTGGPT 




GCCTTCCCGC 


CGTCTTGCCA 


GATTAGTGCC 


TTCPATPAGT 


TGGPTGAGGA 

X OOw X OxiOOxV 


GPTTGGPPAP 

Ob/ X X 00\_/V^rTk.V-» 


^7R0 
J / ou 


AGACCTGTCC CTGTTGCAGC TGTTCTACCA CCCTGCCCCG AGCTCGAACA GGGCCTTCTC 3840

TACCTGCCCC AGGAGCTCAC CACCTGTGAT AGTGTCGTAA CATTTGAATT AACAGACATT 3900

GTGCACTGCC GCATGGCCGC CCCGAGCCAG CGCAAGGCCG TGCTGTCCAC ACTCGTGGGC 3960

CGCTACGGCG GTCGCACAAA GCTCTACAAT GCTTCCCACT CTGATGTTCG CGACTCTCTC 4020

GCCCGTTTTA TCCCGGCCAT TGGCCCCGTA CAGGTTACAA CTTGTGAATT GTACGAGCTA 4080

GTGGAGGCCA TGGTCGAGAA GGGCCAGGAT GGCTCCGCCG TCCTTGAGCT TGATCTTTGC 4140

AACCGTGACG TGTCCAGGAT CACCTTCTTC CAGAAAGATT GTAACAAGTT CACCACAGGT 4200

GAGACCATTG CCCATGGTAA AGTGGGCCAG GGCATCTCGG CCTGGAGCAA GACCTTCTGC 4260

GCCCTCTTTG GCCCTTGGTT CCGCGCTATT GAGAAGGCTA TTCTGGCCCT GCTCCCTCAG 4320

GGTGTGTTTT ACGGTGATGC CTTTGATGAC ACCGTCTTCT CGGCGGCTGT GGCCGCAGCA 4380

AAGGCATCCA TGGTGTTTGA GAATGACTTT TCTGAGTTTG ACTCCACCCA GAATAACTTT 4440

TCTCTGGGTC TAGAGTGTGC TATTATGGAG GAGTGTGGGA TGCCGCAGTG GCTCATCCGC 4500

CTGTATCACC TTATAAGGTC TGCGTGGATC TTGCAGGCCC CGAAGGAGTC TCTGCGAGGG 4560

TTTTGGAAGA AACACTCCGG TGAGCCCGGC ACTCTTCTAT GGAATACTGT CTGGAATATG 4620

GCCGTTATTA CCCACTGTTA TGACTTCCGC GATTTTCAGG TGGCTGCCTT TAAAGGTGAT 4680

GATTCGATAG TGCTTTGCAG TGAGTATCGT CAGAGTCCAG GAGCTGCTGT CCTGATCGCC 4740

GGCTGTGGCT TGAAGTTGAA GGTAGATTTC CGCCCGATCG GTTTGTATGC AGGTGTTGTG 4800

GTGGCCCCCG GCCTTGGCGC GCTCCCTGAT GTTGTGCGCT TCGCCGGCCG GCTTACCGAG 4860

AAGAATTGGG GCCCTGGCCC TGAGCGGGCG GAGCAGCTCC GCCTCGCTGT TAGTGATTTC 4920

CTCCGCAAGC TCACGAATGT AGCTCAGATG TGTGTGGATG TTGTTTCCCG TGTTTATGGG 4980




GTTTCCCCTG GACTCGTTCA TAACCTGATT GGCATGCTAC AGGCTGTTGC TGATGGCAAG 5040

GCACATTTCA CTGAGTCAGT AAAACCAGTG CTCGACTTGA CAAATTCAAT CTTGTGTCGG 5100

GTGGAATGAA TAACATGTCT TTTGCTGCGC CCATGGGTTC GCGACCATGC GCCCTCGGCC 5160

TATTTTGTTG CTGCTCCTCA TGTTTTTGCC TATGCTGCCC GCGCCACCGC CCGGTCAGCC 5220

GTCTGGCCGC CGTCGTGGGC GGCGCAGCGG CGGTTCCGGC GGTGGTTTCT GGGGTGACCG 5280

GGTTGATTCT CAGCCCTTCG CAATCCCCTA TATTCATCCA ACCAACCCCT TCGCCCCCGA 5340

TGTCACCGCT GCGGCCGGGG CTGGACCTCG TGTTCGCCAA CCCGCCCGAC CACTCGGCTC 5400

CGCTTGGCGT GACCAGGCCC AGCGCCCCGC CGTTGCCTCA CGTCGTAGAC CTACCACAGC 5460

TGGGGCCGCG CCGCTAACGG CGGTCGCTCC GGCCCATGAC ACCCCGCCAG TGCCTGATGT 5520

CGACTCCCGC GGCGCCATCT TGCGCCGGCA GTATAACCTA TCAACATCTC CCCTTACCTC 5580

TTCCGTGGCC ACCGGCACTA ACCTGGTTCT TTATGCCGCC CCTCTTAGTC CGCTTTTACC 5640


CCTTCAGGAC 


GGCACCAATA 


CCCATATAAT 


GGCCACGGAA 


GCTTCTAATT 

*w / X X \^ X X*XX X X 


ATGPPPAGTA 




CCGGGTTGCC 


CGTGCCACAA 


TCCGTTACCG 


CCCGCTGGTC 


CCCAATGCTG 


TGGGPGGTTA 




CGCCATCTCC 


ATCTCATTCT 


GGCCACAGAC 


CACCACCACC 


CCGACGTCCG 


TTGATATGAA 


J O £, \s 


TTCAATAACC 


TCGACGGATG 


TTCGTATTTT 


AGTCCAGCCC 


GGCATAGCCT 


PTGAfiPTTftT 
\^ x Orio^ x x o x 


joou 


GATCCCAAGT 


GAGCGCCTAC 


ACTATCGTAA 


CCAAGGCTGG 


CGCTrrGTrn 


AflAPPTPTriP 


-J -7*1 U 


GGTGGCTGAG 


GAGGAGGCTA 


CCTCTGGTCT 


TGTTATGrTT 


TGPATAPATP 
x uun x nun x vjj 


PPTPAPTPHT 


DUUU 


AAATTCCTAT 


ACTAATACAC 


CCTATACCGG 


TGrrrTrnnG 




x 1 bV^LL 1 I 


DUDU 


GCTTGAGTTT 


CGCAACCTTA 


CCCCCGGTAA 


CACCAATACG 




HTTATTPPAf: 

w X X r\ X X 




CACTGCTCGC 


CACCGCCTTC 


GTCGCGGTGC 


GGACGGGAPT 


GPPGACIPTPA 




£i fin 


TGCTACCCGC 


TTTATGAAGG 


ACCTCTATTT 

*lw \^ X X XXX 


TACTAGTAPT 


AATGGTPTPP 


PTnAriATpnr: 




CCGCGGGATA 


GCCCTCACCC 


TGTTCAACCT 

X \J X X ViiflVv X 


TGPTGAPAPT 


PTflPTTnnPP 


PPPTf^PPf^AP 


DjUU 


AGAATTGATT 


TCGTCGGCTG 


GTGGCCAGCT 


GTTPTAPTPP 


PftTPPPfJTTP 


TPTPAf^PPA A 


DjDU 


TGGCGAGCCG 


ACTGTTAAGT 


TGTATACATC 


TGTAGAGAAT 


GCTPAGPAGG 


ATAAf^HTAT 


u*i 


TGCAATCCCG 


CATGACATTG 


ACCTCGGAGA 


ATCTCGTGTG 

XX X X vVJ X \J X \J 


GTTATTCAGG 


ATTATHATAA 
n x x /"i x on x nn 




CCAACATGAA 


CAAGATCGGC 


CGACGCCTTC 


TCCAGCCCCA 


TCGCGCCCTT 


TPTPTf^TPPT 

X V— ' X V— ' X V3 X X 




TCGAGCTAAT 


GATGTGCTTT 


GGCTCTCTCT 


CACCGCTGCC 


GAGTATGACC 


AGTPPAPTTA 




TGGCTCTTCG ACTGGCCCAG TTTATGTTTC TGACTCTGTG ACCTTGGTTA ATGTTGCGAC 6660

CGGCGCGCAG GCCGTTGCCC GGTCGCTCGA TTGGACCAAG GTCACACTTG ACGGTCGCCC 6720

CCTCTCCACC ATCCAGCAGT ACTCGAAGAC CTTCTTTGTC CTGCCGCTCC GCGGTAAGCT 6780

CTCTTTCTGG GAGGCAGGCA CAACTAAAGC CGGGTACCCT TATAATTATA ACACCACTGC 6840

TAGCGACCAA CTGCTTGTCG AGAATGCCGC CGGGCACCGG GTCGCTATTT CCACTTACAC 6900


UAL, I AbUL* i kj 


KdKd I LiU 1 b(j I C 


CCGTCTCCAT 


TTCTGCGGTT 


GCCGTTTTAG 


CCCCCCACTC 


6960 


TGCGCTAGCA TTGCTTGAGG ATACCTTGGA CTACCCTGCC CGCGCCCATA CTTTTGATGA 7020

TTTCTGCCCA GAGTGCCGCC CCCTTGGCCT TCAGGGCTGC GCTTTCCAGT CTACTGTCGC 7080

TGAGCTTCAG CGCCTTAAGA TGAAGGTGGG TAAAACTCGG GAGTTGTAGT TTATTTGCTT 7140

GTGCCCCCCT TCTTTCTGTT GCTTATTTCT CATTTCTGCG TTCCGCGCTC CCTGA 7195



(2) INFORMATION FOR SEQ ID NO: 7: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1693 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7: 

Met Glu Ala His Gln Phe Ile Lys Ala Pro Gly Ile Thr Thr Ala Ile
1 5 10 15

Glu Gln Ala Ala Leu Ala Ala Ala Asn Ser Ala Leu Ala Asn Ala Val
20 25 30

Val Val Arg Pro Phe Leu Ser His Gln Gln Ile Glu Ile Leu Ile Asn
35 40 45

Leu Met Gln Pro Arg Gln Leu Val Phe Arg Pro Glu Val Phe Trp Asn
50 55 60

His Pro Ile Gln Arg Val Ile His Asn Glu Leu Glu Leu Tyr Cys Arg
65 70 75 80

Ala Arg Ser Gly Arg Cys Leu Glu Ile Gly Ala His Pro Arg Ser Ile
85 90 95

Asn Asp Asn Pro Asn Val Val His Arg Cys Phe Leu Arg Pro Val Gly
100 105 110

Arg Asp Val Gln Arg Trp Tyr Thr Ala Pro Thr Arg Gly Pro Ala Ala
115 120 125

Asn Cys Arg Arg Ser Ala Leu Arg Gly Leu Pro Ala Ala Asp Arg Thr
130 135 140

Tyr Cys Leu Asp Gly Phe Ser Gly Cys Asn Phe Pro Ala Glu Thr Gly
145 150 155 160

Ile Ala Leu Tyr Ser Leu His Asp Met Ser Pro Ser Asp Val Ala Glu
165 170 175

Ala Met Phe Arg His Gly Met Thr Arg Leu Tyr Ala Ala Leu His Leu
180 185 190

Pro Pro Glu Val Leu Leu Pro Pro Gly Thr Tyr Arg Thr Ala Ser Tyr
195 200 205

Leu Leu Ile His Asp Gly Arg Arg Val Val Val Thr Tyr Glu Gly Asp
210 215 220

Thr Ser Ala Gly Tyr Asn His Asp Val Ser Asn Leu Arg Ser Trp Ile
225 230 235 240

Arg Thr Thr Lys Val Thr Gly Asp His Pro Leu Val Ile Glu Arg Val
245 250 255



Arg Ala Ile Gly Cys His Phe Val Leu Leu Leu Thr Ala Ala Pro Glu
260 265 270

Pro Ser Pro Met Pro Tyr Val Pro Tyr Pro Arg Ser Thr Glu Val Tyr
275 280 285

Val Arg Ser Ile Phe Gly Pro Gly Gly Thr Pro Ser Leu Phe Pro Thr
290 295 300

Ser Cys Ser Thr Lys Ser Thr Phe His Ala Val Pro Ala His Ile Trp
305 310 315 320

Asp Arg Leu Met Leu Phe Gly Ala Thr Leu Asp Asp Gln Ala Phe Cys
325 330 335

Cys Ser Arg Leu Met Thr Tyr Leu Arg Gly Ile Ser Tyr Lys Val Thr
340 345 350

Val Gly Thr Leu Val Ala Asn Glu Gly Trp Asn Ala Ser Glu Asp Ala
355 360 365

Leu Thr Ala Val Ile Thr Ala Ala Tyr Leu Thr Ile Cys His Gln Arg
370 375 380

Tyr Leu Arg Thr Gln Ala Ile Ser Lys Gly Met Arg Arg Leu Glu Arg
385 390 395 400

Glu His Ala Gln Lys Phe Ile Thr Arg Leu Tyr Ser Trp Leu Phe Glu
405 410 415

Lys Ser Gly Arg Asp Tyr Ile Pro Gly Arg Gln Leu Glu Phe Tyr Ala
420 425 430

Gln Cys Arg Arg Trp Leu Ser Ala Gly Phe His Leu Asp Pro Arg Val
435 440 445

Leu Val Phe Asp Glu Ser Ala Pro Cys His Cys Arg Thr Ala Ile Arg
450 455 460

Lys Ala Leu Ser Lys Phe Cys Cys Phe Met Lys Trp Leu Gly Gln Glu
465 470 475 480

Cys Thr Cys Phe Leu Gln Pro Ala Glu Gly Ala Val Gly Asp Gln Gly
485 490 495

His Asp Asn Glu Ala Tyr Glu Gly Ser Asp Val Asp Pro Ala Glu Ser
500 505 510

Ala Ile Ser Asp Ile Ser Gly Ser Tyr Val Val Pro Gly Thr Ala Leu
515 520 525

Gln Pro Leu Tyr Gln Ala Leu Asp Leu Pro Ala Glu Ile Val Ala Arg
530 535 540

Ala Gly Arg Leu Thr Ala Thr Val Lys Val Ser Gln Val Asp Gly Arg
545 550 555 560

Ile Asp Cys Glu Thr Leu Leu Gly Asn Lys Thr Phe Arg Thr Ser Phe
565 570 575

Val Asp Gly Ala Val Leu Glu Thr Asn Gly Pro Glu Arg His Asn Leu
580 585 590

Ser Phe Asp Ala Ser Gln Ser Thr Met Ala Ala Gly Pro Phe Ser Leu
595 600 605

Thr Tyr Ala Ala Ser Ala Ala Gly Leu Glu Val Arg Tyr Val Ala Ala
610 615 620

Gly Leu Asp His Arg Ala Val Phe Ala Pro Gly Val Ser Pro Arg Ser
625 630 635 640

Ala Pro Gly Glu Val Thr Ala Phe Cys Ser Ala Leu Tyr Arg Phe Asn
645 650 655

Arg Glu Ala Gln Arg His Ser Leu Ile Gly Asn Leu Trp Phe His Pro
660 665 670

Glu Gly Leu Ile Gly Leu Phe Ala Pro Phe Ser Pro Gly His Val Trp
675 680 685

Glu Ser Ala Asn Pro Phe Cys Gly Glu Ser Thr Leu Tyr Thr Arg Thr
690 695 700

Trp Ser Glu Val Asp Ala Val Ser Ser Pro Ala Arg Pro Asp Leu Gly
705 710 715 720

Phe Met Ser Glu Pro Ser Ile Pro Ser Arg Ala Ala Thr Pro Thr Leu
725 730 735

Ala Ala Pro Leu Pro Pro Pro Ala Pro Asp Pro Ser Pro Pro Pro Ser
740 745 750

Ala Pro Ala Leu Ala Glu Pro Ala Ser Gly Ala Thr Ala Gly Ala Pro
755 760 765

Ala Ile Thr His Gln Thr Ala Arg His Arg Arg Leu Leu Phe Thr Tyr
770 775 780

Pro Asp Gly Ser Lys Val Phe Ala Gly Ser Leu Phe Glu Ser Thr Cys
785 790 795 800

Thr Trp Leu Val Asn Ala Ser Asn Val Asp His Arg Pro Gly Gly Gly
805 810 815

Leu Cys His Ala Phe Tyr Gln Arg Tyr Pro Ala Ser Phe Asp Ala Ala
820 825 830

Ser Phe Val Met Arg Asp Gly Ala Ala Ala Tyr Thr Leu Thr Pro Arg
835 840 845

Pro Ile Ile His Ala Val Ala Pro Asp Tyr Arg Leu Glu His Asn Pro
850 855 860

Lys Arg Leu Glu Ala Ala Tyr Arg Glu Thr Cys Ser Arg Leu Gly Thr
865 870 875 880

Ala Ala Tyr Pro Leu Leu Gly Thr Gly Ile Tyr Gln Val Pro Ile Gly
885 890 895

Pro Ser Phe Asp Ala Trp Glu Arg Asn His Arg Pro Gly Asp Glu Leu
900 905 910

Tyr Leu Pro Glu Leu Ala Ala Arg Trp Phe Glu Ala Asn Arg Pro Thr
915 920 925

Arg Pro Thr Leu Thr Ile Thr Glu Asp Val Ala Arg Thr Ala Asn Leu
930 935 940

Ala Ile Glu Leu Asp Ser Ala Thr Asp Val Gly Arg Ala Cys Ala Gly
945 950 955 960

Cys Arg Val Thr Pro Gly Val Val Gln Tyr Gln Phe Thr Ala Gly Val
965 970 975

Pro Gly Ser Gly Lys Ser Arg Ser Ile Thr Gln Ala Asp Val Asp Val
980 985 990

Val Val Val Pro Thr Arg Glu Leu Arg Asn Ala Trp Arg Arg Arg Gly
995 1000 1005

Phe Ala Ala Phe Thr Pro His Thr Ala Ala Arg Val Thr Gln Gly Arg
1010 1015 1020

Arg Val Val Ile Asp Glu Ala Pro Ser Leu Pro Pro His Leu Leu Leu
1025 1030 1035 1040

Leu His Met Gln Arg Ala Ala Thr Val His Leu Leu Gly Asp Pro Asn
1045 1050 1055

Gln Ile Pro Ala Ile Asp Phe Glu His Ala Gly Leu Val Pro Ala Ile
1060 1065 1070

Arg Pro Asp Leu Gly Pro Thr Ser Trp Trp His Val Thr His Arg Trp
1075 1080 1085

Pro Ala Asp Val Cys Glu Leu Ile Arg Gly Ala Tyr Pro Met Ile Gln
1090 1095 1100

Thr Thr Ser Arg Val Leu Arg Ser Leu Phe Trp Gly Glu Pro Ala Val
1105 1110 1115 1120

Gly Gln Lys Leu Val Phe Thr Gln Ala Ala Lys Pro Ala Asn Pro Gly
1125 1130 1135

Ser Val Thr Val His Glu Ala Gln Gly Ala Thr Tyr Thr Glu Thr Thr
1140 1145 1150

Ile Ile Ala Thr Ala Asp Ala Arg Gly Leu Ile Gln Ser Ser Arg Ala
1155 1160 1165

His Ala Ile Val Ala Leu Thr Arg His Thr Glu Lys Cys Val Ile Ile
1170 1175 1180

Asp Ala Pro Gly Leu Leu Arg Glu Val Gly Ile Ser Asp Ala Ile Val
1185 1190 1195 1200

Asn Asn Phe Phe Leu Ala Gly Gly Glu Ile Gly His Gln Arg Pro Ser
1205 1210 1215

Val Ile Pro Arg Gly Asn Pro Asp Ala Asn Val Asp Thr Leu Ala Ala
1220 1225 1230

Phe Pro Pro Ser Cys Gln Ile Ser Ala Phe His Gln Leu Ala Glu Glu
1235 1240 1245

Leu Gly His Arg Pro Val Pro Val Ala Ala Val Leu Pro Pro Cys Pro
1250 1255 1260

Glu Leu Glu Gln Gly Leu Leu Tyr Leu Pro Gln Glu Leu Thr Thr Cys
1265 1270 1275 1280

Asp Ser Val Val Thr Phe Glu Leu Thr Asp Ile Val His Cys Arg Met
1285 1290 1295

Ala Ala Pro Ser Gln Arg Lys Ala Val Leu Ser Thr Leu Val Gly Arg
1300 1305 1310

Tyr Gly Gly Arg Thr Lys Leu Tyr Asn Ala Ser His Ser Asp Val Arg
1315 1320 1325

Asp Ser Leu Ala Arg Phe Ile Pro Ala Ile Gly Pro Val Gln Val Thr
1330 1335 1340

Thr Cys Glu Leu Tyr Glu Leu Val Glu Ala Met Val Glu Lys Gly Gln
1345 1350 1355 1360

Asp Gly Ser Ala Val Leu Glu Leu Asp Leu Cys Asn Arg Asp Val Ser
1365 1370 1375

Arg Ile Thr Phe Phe Gln Lys Asp Cys Asn Lys Phe Thr Thr Gly Glu
1380 1385 1390

Thr Ile Ala His Gly Lys Val Gly Gln Gly Ile Ser Ala Trp Ser Lys
1395 1400 1405

Thr Phe Cys Ala Leu Phe Gly Pro Trp Phe Arg Ala Ile Glu Lys Ala
1410 1415 1420

Ile Leu Ala Leu Leu Pro Gln Gly Val Phe Tyr Gly Asp Ala Phe Asp
1425 1430 1435 1440

Asp Thr Val Phe Ser Ala Ala Val Ala Ala Ala Lys Ala Ser Met Val
1445 1450 1455

Phe Glu Asn Asp Phe Ser Glu Phe Asp Ser Thr Gln Asn Asn Phe Ser
1460 1465 1470

Leu Gly Leu Glu Cys Ala Ile Met Glu Glu Cys Gly Met Pro Gln Trp
1475 1480 1485

Leu Ile Arg Leu Tyr His Leu Ile Arg Ser Ala Trp Ile Leu Gln Ala
1490 1495 1500

Pro Lys Glu Ser Leu Arg Gly Phe Trp Lys Lys His Ser Gly Glu Pro
1505 1510 1515 1520

Gly Thr Leu Leu Trp Asn Thr Val Trp Asn Met Ala Val Ile Thr His
1525 1530 1535

Cys Tyr Asp Phe Arg Asp Phe Gln Val Ala Ala Phe Lys Gly Asp Asp
1540 1545 1550

Ser Ile Val Leu Cys Ser Glu Tyr Arg Gln Ser Pro Gly Ala Ala Val
1555 1560 1565

Leu Ile Ala Gly Cys Gly Leu Lys Leu Lys Val Asp Phe Arg Pro Ile
1570 1575 1580

Gly Leu Tyr Ala Gly Val Val Val Ala Pro Gly Leu Gly Ala Leu Pro 
1585 1590 1595 1600 

Asp Val Val Arg Phe Ala Gly Arg Leu Thr Glu Lys Asn Trp Gly Pro 
1605 1610 1615 

Gly Pro Glu Arg Ala Glu Gin Leu Arg Leu Ala Val Ser Asp Phe Leu 
1620 1625 1630 

Arg Lys Leu Thr Asn Val Ala Gin Met Cys Val Asp Val Val Ser Arg 
1635 1640 1645 

Val Tyr Gly Val Ser Pro Gly Leu Val His Asn Leu lie Gly Met Leu 
1650 * 1655 1660 

Gin Ala Val Ala Asp Gly Lys Ala His Phe Thr Glu Ser Val Lys Pro 
1665 1670 1675 1680 

Val Leu Asp Leu Thr Asn Ser lie Leu Cys Arg Val Glu 
1685 1690 

(2) INFORMATION FOR SEQ ID NO: 8: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 660 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8: 

Met Arg Pro Arg Pro Ile Leu Leu Leu Leu Leu Met Phe Leu Pro Met 
1 5 10 15 

Leu Pro Ala Pro Pro Pro Gly Gln Pro Ser Gly Arg Arg Arg Gly Arg 
20 25 30 

Arg Ser Gly Gly Ser Gly Gly Gly Phe Trp Gly Asp Arg Val Asp Ser 
35 40 45 

Gln Pro Phe Ala Ile Pro Tyr Ile His Pro Thr Asn Pro Phe Ala Pro 
50 55 60 

Asp Val Thr Ala Ala Ala Gly Ala Gly Pro Arg Val Arg Gln Pro Ala 
65 70 75 80 

Arg Pro Leu Gly Ser Ala Trp Arg Asp Gln Ala Gln Arg Pro Ala Val 
85 90 95 

Ala Ser Arg Arg Arg Pro Thr Thr Ala Gly Ala Ala Pro Leu Thr Ala 
100 105 110 

Val Ala Pro Ala His Asp Thr Pro Pro Val Pro Asp Val Asp Ser Arg 
115 120 125 

Gly Ala Ile Leu Arg Arg Gln Tyr Asn Leu Ser Thr Ser Pro Leu Thr 
130 135 140 

Ser Ser Val Ala Thr Gly Thr Asn Leu Val Leu Tyr Ala Ala Pro Leu 
145 150 155 160 

Ser Pro Leu Leu Pro Leu Gln Asp Gly Thr Asn Thr His Ile Met Ala 
165 170 175 

Thr Glu Ala Ser Asn Tyr Ala Gln Tyr Arg Val Ala Arg Ala Thr Ile 
180 185 190 

Arg Tyr Arg Pro Leu Val Pro Asn Ala Val Gly Gly Tyr Ala Ile Ser 
195 200 205 

Ile Ser Phe Trp Pro Gln Thr Thr Thr Thr Pro Thr Ser Val Asp Met 
210 215 220 

Asn Ser Ile Thr Ser Thr Asp Val Arg Ile Leu Val Gln Pro Gly Ile 
225 230 235 240 

Ala Ser Glu Leu Val Ile Pro Ser Glu Arg Leu His Tyr Arg Asn Gln 
245 250 255 

Gly Trp Arg Ser Val Glu Thr Ser Gly Val Ala Glu Glu Glu Ala Thr 
260 265 270 

Ser Gly Leu Val Met Leu Cys Ile His Gly Ser Leu Val Asn Ser Tyr 
275 280 285 

Thr Asn Thr Pro Tyr Thr Gly Ala Leu Gly Leu Leu Asp Phe Ala Leu 
290 295 300 

Glu Leu Glu Phe Arg Asn Leu Thr Pro Gly Asn Thr Asn Thr Arg Val 
305 310 315 320 

Ser Arg Tyr Ser Ser Thr Ala Arg His Arg Leu Arg Arg Gly Ala Asp 
325 330 335 

Gly Thr Ala Glu Leu Thr Thr Thr Ala Ala Thr Arg Phe Met Lys Asp 
340 345 350 

Leu Tyr Phe Thr Ser Thr Asn Gly Val Gly Glu Ile Gly Arg Gly Ile 
355 360 365 

Ala Leu Thr Leu Phe Asn Leu Ala Asp Thr Leu Leu Gly Gly Leu Pro 
370 375 380 

Thr Glu Leu Ile Ser Ser Ala Gly Gly Gln Leu Phe Tyr Ser Arg Pro 
385 390 395 400 

Val Val Ser Ala Asn Gly Glu Pro Thr Val Lys Leu Tyr Thr Ser Val 
405 410 415 

Glu Asn Ala Gln Gln Asp Lys Gly Ile Ala Ile Pro His Asp Ile Asp 
420 425 430 

Leu Gly Glu Ser Arg Val Val Ile Gln Asp Tyr Asp Asn Gln His Glu 
435 440 445 

Gln Asp Arg Pro Thr Pro Ser Pro Ala Pro Ser Arg Pro Phe Ser Val 
450 455 460 

Leu Arg Ala Asn Asp Val Leu Trp Leu Ser Leu Thr Ala Ala Glu Tyr 
465 470 475 480 

Asp Gln Ser Thr Tyr Gly Ser Ser Thr Gly Pro Val Tyr Val Ser Asp 
485 490 495 

Ser Val Thr Leu Val Asn Val Ala Thr Gly Ala Gln Ala Val Ala Arg 
500 505 510 

Ser Leu Asp Trp Thr Lys Val Thr Leu Asp Gly Arg Pro Leu Ser Thr 
515 520 525 

Ile Gln Gln Tyr Ser Lys Thr Phe Phe Val Leu Pro Leu Arg Gly Lys 
530 535 540 

Leu Ser Phe Trp Glu Ala Gly Thr Thr Lys Ala Gly Tyr Pro Tyr Asn 
545 550 555 560 

Tyr Asn Thr Thr Ala Ser Asp Gln Leu Leu Val Glu Asn Ala Ala Gly 
565 570 575 

His Arg Val Ala Ile Ser Thr Tyr Thr Thr Ser Leu Gly Ala Gly Pro 
580 585 590 

Val Ser Ile Ser Ala Val Ala Val Leu Ala Pro His Ser Ala Leu Ala 
595 600 605 

Leu Leu Glu Asp Thr Leu Asp Tyr Pro Ala Arg Ala His Thr Phe Asp 
610 615 620 

Asp Phe Cys Pro Glu Cys Arg Pro Leu Gly Leu Gln Gly Cys Ala Phe 
625 630 635 640 

Gln Ser Thr Val Ala Glu Leu Gln Arg Leu Lys Met Lys Val Gly Lys 
645 650 655 

Thr Arg Glu Leu 
660 

(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 123 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9: 

Met Asn Asn Met Ser Phe Ala Ala Pro Met Gly Ser Arg Pro Cys Ala 
1 5 10 15 

Leu Gly Leu Phe Cys Cys Cys Ser Ser Cys Phe Cys Leu Cys Cys Pro 
20 25 30 

Arg His Arg Pro Val Ser Arg Leu Ala Ala Val Val Gly Gly Ala Ala 
35 40 45 

Ala Val Pro Ala Val Val Ser Gly Val Thr Gly Leu Ile Leu Ser Pro 
50 55 60 

Ser Gln Ser Pro Ile Phe Ile Gln Pro Thr Pro Ser Pro Pro Met Ser 
65 70 75 80 

Pro Leu Arg Pro Gly Leu Asp Leu Val Phe Ala Asn Pro Pro Asp His 
85 90 95 

Ser Ala Pro Leu Gly Val Thr Arg Pro Ser Ala Pro Pro Leu Pro His 
100 105 110 

Val Val Asp Leu Pro Gln Leu Gly Pro Arg Arg 
115 120 

(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7171 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO 

(vi) ORIGINAL SOURCE: 

(C) INDIVIDUAL ISOLATE: Composite Mexico strain 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 



GCCATGGAGG CCCACCAGTT CATTAAGGCT CCTGGCATCA CTACTGCTAT TGAGCAAGCA 60 

GCTCTAGCAG CGGCCAACTC CGCCCTTGCG AATGCTGTGG TGGTCCGGCC TTTCCTTTCC 120 

CATCAGCAGG TTGAGATCCT TATAAATCTC ATGCAACCTC GGCAGCTGGT GTTTCGTCCT 180 

GAGGTTTTTT GGAATCACCC GATTCAACGT GTTATACATA ATGAGCTTGA GCAGTATTGC 240 

CGTGCTCGCT CGGGTCGCTG CCTTGAGATT GGAGCCCACC CACGCTCCAT TAATGATAAT 300 

CCTAATGTCC TCCATCGCTG CTTTCTCCAC CCCGTCGGCC GGGATGTTCA GCGCTGGTAC 360 

ACAGCCCCGA CTAGGGGACC TGCGGCGAAC TGTCGCCGCT CGGCACTTCG TGGTCTGCCA 420 

CCAGCCGACC GCACTTACTG TTTTGATGGC TTTGCCGGCT GCCGTTTTGC CGCCGAGACT 480 

GGTGTGGCTC TCTATTCTCT CCATGACTTG CAGCCGGCTG ATGTTGCCGA GGCGATGGCT 540 

CGCCACGGCA TGACCCGCCT TTATGCAGCT TTCCACTTGC CTCCAGAGGT GCTCCTGCCT 600 

CCTGGCACCT ACCGGACATC ATCCTACTTG CTGATCCACG ATGGTAAGCG CGCGGTTGTC 660 

ACTTATGAGG GTGACACTAG CGCCGGTTAC AATCATGATG TTGCCACCCT CCGCACATGG 720 

ATCAGGACAA CTAAGGTTGT GGGTGAACAC CCTTTGGTGA TCGAGCGGGT GCGGGGTATT 780 

GGCTGTCACT TTGTGTTGTT GATCACTGCG GCCCCTGAGC CCTCCCCGAT GCCCTACGTT 840 

CCTTACCCGC GTTCGACGGA GGTCTATGTC CGGTCTATCT TTGGGCCCGG CGGGTCCCCG 900 

TCGCTGTTCC CGACCGCTTG TGCTGTCAAG TCCACTTTTC ACGCCGTCCC CACGCACATC 960 

TGGGACCGTC TCATGCTCTT TGGGGCCACC CTCGACGACC AGGCCTTTTG CTGCTCCAGG 1020 

CTTATGACGT ACCTTCGTGG CATTAGCTAT AAGGTAACTG TGGGTGCCCT GGTCGCTAAT 1080 

GAAGGCTGGA ATGCCACCGA GGATGCGCTC ACTGCAGTTA TTACGGCGGC TTACCTCACA 1140 

ATATGTCATC AGCGTTATTT GCGGACCCAG GCGATTTCTA AGGGCATGCG CCGGCTTGAG 1200 

CTTGAACATG CTCAGAAATT TATTTCACGC CTCTACAGCT GGCTATTTGA GAAGTCAGGT 1260 

CGTGATTACA TCCCAGGCCG CCAGCTGCAG TTCTACGCTC AGTGCCGCCG CTGGTTATCT 1320 

GCCGGGTTCC ATCTCGACCC CCGCACCTTA GTTTTTGATG AGTCAGTGCC TTGTAGCTGC 1380 

CGAACCACCA TCCGGCGGAT CGCTGGAAAA TTTTGCTGTT TTATGAAGTG GCTCGGTCAG 1440 

GAGTGTTCTT GTTTCCTCCA GCCCGCCGAG GGGCTGGCGG GCGACCAAGG TCATGACAAT 1500 

GAGGCCTATG AAGGCTCTGA TGTTGATACT GCTGAGCCTG CCACCCTAGA CATTACAGGC 1560 

TCATACATCG TGGATGGTCG GTCTCTGCAA ACTGTCTATC AAGCTCTCGA CCTGCCAGCT 1620 

GACCTGGTAG CTCGCGCAGC CCGACTGTCT GCTACAGTTA CTGTTACTGA AACCTCTGGC 1680 

CGTCTGGATT GCCAAACAAT GATCGGCAAT AAGACTTTTC TCACTACCTT TGTTGATGGG 1740 

GCACGCCTTG AGGTTAACGG GCCTGAGCAG CTTAACCTCT CTTTTGACAG CCAGCAGTGT 1800 

AGTATGGCAG CCGGCCCGTT TTGCCTCACC TATGCTGCCG TAGATGGCGG GCTGGAAGTT 1860 

CATTTTTCCA CCGCTGGCCT CGAGAGCCGT GTTGTTTTCC CCCCTGGTAA TGCCCCGACT 1920 

GCCCCGCCGA GTGAGGTCAC CGCCTTCTGC TCAGCTCTTT ATAGGCACAA CCGGCAGAGC 1980 

CAGCGCCAGT CGGTTATTGG TAGTTTGTGG CTGCACCCTG AAGGTTTGCT CGGCCTGTTC 2040 

CCGCCCTTTT CACCCGGGCA TGAGTGGCGG TCTGCTAACC CATTTTGCGG CGAGAGCACG 2100 

CTCTACACCC GCACTTGGTC CACAATTACA GACACACCCT TAACTGTCGG GCTAATTTCC 2160 

GGTCATTTGG ATGCTGCTCC CCACTCGGGG GGGCCACCTG CTACTGCCAC AGGCCCTGCT 2220 

GTAGGCTCGT CTGACTCTCC AGACCCTGAC CCGCTACCTG ATGTTACAGA TGGCTCACGC 2280 

CCCTCTGGGG CCCGTCCGGC TGGCCCCAAC CCGAATGGCG TTCCGCAGCG CCGCTTACTA 2340 

CACACCTACC CTGACGGCGC TAAGATCTAT GTCGGCTCCA TTTTCGAGTC TGAGTGCACC 2400 

TGGCTTGTCA ACGCATCTAA CGCCGGCCAC CGCCCTGGTG GCGGGCTTTG TCATGCTTTT 2460 

TTTCAGCGTT ACCCTGATTC GTTTGACGCC ACCAAGTTTG TGATGCGTGA TGGTCTTGCC 2520 

GCGTATACCC TTACACCCCG GCCGATCATT CATGCGGTGG CCCCGGACTA TCGATTGGAA 2580 

CATAACCCCA AGAGGCTCGA GGCTGCCTAC CGCGAGACTT GCGCCCGCCG AGGCACTGCT 2640 

GCCTATCCAC TCTTAGGCGC TGGCATTTAC CAGGTGCCTG TTAGTTTGAG TTTTGATGCC 2700 

TGGGAGCGGA ACCACCGCCC GTTTGACGAG CTTTACCTAA CAGAGCTGGC GGCTCGGTGG 2760 

TTTGAATCCA ACCGCCCCGG TCAGCCCACG TTGAACATAA CTGAGGATAC CGCCCGTGCG 2820 

GCCAACCTGG CCCTGGAGCT TGACTCCGGG AGTGAAGTAG GCCGCGCATG TGCCGGGTGT 2880 

AAAGTCGAGC CTGGCGTTGT GCGGTATCAG TTTACAGCCG GTGTCCCCGG CTCTGGCAAG 2940 

TCAAAGTCCG TGCAACAGGC GGATGTGGAT GTTGTTGTTG TGCCCACTCG CGAGCTTCGG 3000 

AACGCTTGGC GGCGCCGGGG CTTTGCGGCA TTCACTCCGC ACACTGCGGC CCGTGTCACT 3060 

AGCGGCCGTA GGGTTGTCAT TGATGAGGCC CCTTCGCTCC CCCCACACTT GCTGCTTTTA 3120 

CATATGCAGC GTGCTGCATC TGTGCACCTC CTTGGGGACC CGAATCAGAT CCCCGCCATA 3180 

GATTTTGAGC ACACCGGTCT GATTCCAGCA ATACGGCCGG AGTTGGTCCC GACTTCATGG 3240 

TGGCATGTCA CCCACCGTTG CCCTGCAGAT GTCTGTGAGT TAGTCCGTGG TGCTTACCCT 3300 

AAAATCCAGA CTACAAGTAA GGTGCTCCGT TCCCTTTTCT GGGGAGAGCC AGCTGTCGGC 3360 

CAGAAGCTAG TGTTCACACA GGCTGCTAAG GCCGCGCACC CCGGATCTAT AACGGTCCAT 3420 

GAGGCCCAGG GTGCCACTTT TACCACTACA ACTATAATTG CAACTGCAGA TGCCCGTGGC 3480 

CTCATACAGT CCTCCCGGGC TCACGCTATA GTTGCTCTCA CTAGGCATAC TGAAAAATGT 3540 

GTTATACTTG ACTCTCCCGG CCTGTTGCGT GAGGTGGGTA TCTCAGATGC CATTGTTAAT 3600 

AATTTCTTCC TTTCGGGTGG CGAGGTTGGT CACCAGAGAC CATCGGTCAT TCCGCGAGGC 3660 

AACCCTGACC GCAATGTTGA CGTGCTTGCG GCGTTTCCAC CTTCATGCCA AATAAGCGCC 3720 

TTCCATCAGC TTGCTGAGGA GCTGGGCCAC CGGCCGGCGC CGGTGGCGGC TGTGCTACCT 3780 

CCCTGCCCTG AGCTTGAGCA GGGCCTTCTC TATCTGCCAC AGGAGCTAGC CTCCTGTGAC 3840 

AGTGTTGTGA CATTTGAGCT AACTGACATT GTGCACTGCC GCATGGCGGC CCCTAGCCAA 3900 

AGGAAAGCTG TTTTGTCCAC GCTGGTAGGC CGGTATGGCA GACGCACAAG GCTTTATGAT 3960 

GCGGGTCACA CCGATGTCCG CGCCTCCCTT GCGCGCTTTA TTCCCACTCT CGGGCGGGTT 4020 

ACTGCCACCA CCTGTGAACT CTTTGAGCTT GTAGAGGCGA TGGTGGAGAA GGGCCAAGAC 4080 

GGTTCAGCCG TCCTCGAGTT GGATTTGTGC AGCCGAGATG TCTCCCGCAT AACCTTTTTC 4140 

CAGAAGGATT GTAACAAGTT CACGACCGGC GAGACAATTG CGCATGGCAA AGTCGGTCAG 4200 

GGTATCTTCC GCTGGAGTAA GACGTTTTGT GCCCTGTTTG GCCCCTGGTT CCGTGCGATT 4260 

GAGAAGGCTA TTCTATCCCT TTTACCACAA GCTGTGTTCT ACGGGGATGC TTATGACGAC 4320 

TCAGTATTCT CTGCTGCCGT GGCTGGCGCC AGCCATGCCA TGGTGTTTGA AAATGATTTT 4380 

[illegible] ACTCGACTCA GAATAACTTT [illegible] TCCCTAGGTC TTGAGTGCGC CATTATGGAA 4440 

GAGTGTGGTA TGCCCCAGTG GCTTGTCAGG TTGTACCATG CCGTCCGGTC GGCGTGGATC 4500 

CTGCAGGCCC CAAAAGAGTC TTTGAGAGGG TTCTGGAAGA AGCATTCTGG TGAGCCGGGC 4560 

AGCTTGCTCT GGAATACGGT GTGGAACATG GCAATCATTG CCCATTGCTA TGAGTTCCGG 4620 

GACCTCCAGG TTGCCGCCTT CAAGGGCGAC GACTCGGTCG TCCTCTGTAG TGAATACCGC 4680 

CAGAGCCCAG GCGCCGGTTC GCTTATAGCA GGCTGTGGTT TGAAGTTGAA GGCTGACTTC 4740 

CGGCCGATTG GGCTGTATGC CGGGGTTGTC GTCGCCCCGG GGCTCGGGGC CCTACCCGAT 4800 

GTCGTTCGAT TCGCCGGACG GCTTTCGGAG AAGAACTGGG GGCCTGATCC GGAGCGGGCA 4860 

GAGCAGCTCC GCCTCGCCGT GCAGGATTTC CTCCGTAGGT TAACGAATGT GGCCCAGATT 4920 

TGTGTTGAGG TGGTGTCTAG AGTTTACGGG GTTTCCCCGG GTCTGGTTCA TAACCTGATA 4980 

GGCATGCTCC AGACTATTGG TGATGGTAAG GCGCATTTTA CAGAGTCTGT TAAGCCTATA 5040 

CTTGACCTTA CACACTCAAT TATGCACCGG TCTGAATGAA TAACATGTGG TTTGCTGCGC 5100 

CCATGGGTTC GCCACCATGC GCCCTAGGCC TCTTTTGCTG TTGTTCCTCT TGTTTCTGCC 5160 

TATGTTGCCC GCGCCACCGA CCGGTCAGCC GTCTGGCCGC CGTCGTGGGC GGCGCAGCGG 5220 

CGGTACCGGC GGTGGTTTCT GGGGTGACCG GGTTGATTCT CAGCCCTTCG CAATCCCCTA 5280 

TATTCATCCA ACCAACCCCT TTGCCCCAGA CGTTGCCGCT GCGTCCGGGT CTGGACCTCG 5340 

CCTTCGCCAA CCAGCCCGGC CACTTGGCTC CACTTGGCGA GATCAGGCCC AGCGCCCCTC 5400 

CGCTGCCTCC CGTCGCCGAC CTGCCACAGC CGGGGCTGCG GCGCTGACGG CTGTGGCGCC 5460 

TGCCCATGAC ACCTCACCCG TCCCGGACGT TGATTCTCGC GGTGCAATTC TACGCCGCCA 5520 

GTATAATTTG TCTACTTCAC CCCTGACATC CTCTGTGGCC TCTGGCACTA ATTTAGTCCT 5580 

GTATGCAGCC CCCCTTAATC CGCCTCTGCC GCTGCAGGAC GGTACTAATA CTCACATTAT 5640 

GGCCACAGAG GCCTCCAATT ATGCACAGTA CCGGGTTGCC CGCGCTACTA TCCGTTACCG 5700 

GCCCCTAGTG CCTAATGCAG TTGGAGGCTA TGCTATATCC ATTTCTTTCT GGCCTCAAAC 5760 

AACCACAACC CCTACATCTG TTGACATGAA TTCCATTACT TCCACTGATG TCAGGATTCT 5820 

TGTTCAACCT GGCATAGCAT CTGAATTGGT CATCCCAAGC GAGCGCCTTC ACTACCGCAA 5880 

TCAAGGTTGG CGCTCGGTTG AGACATCTGG TGTTGCTGAG GAGGAAGCCA CCTCCGGTCT 5940 

TGTCATGTTA TGCATACATG GCTCTCCAGT TAACTCCTAT ACCAATACCC CTTATACCGG 6000 

TGCCCTTGGC TTACTGGACT TTGCCTTAGA GCTTGAGTTT CGCAATCTCA CCACCTGTAA 6060 

CACCAATACA CGTGTGTCCC GTTACTCCAG CACTGCTCGT CACTCCGCCC GAGGGGCCGA 6120 

CGGGACTGCG GAGCTGACCA CAACTGCAGC CACCAGGTTC ATGAAAGATC TCCACTTTAC 6180 

CGGCCTTAAT GGGGTAGGTG AAGTCGGCCG CGGGATAGCT CTAACATTAC TTAACCTTGC 6240 

TGACACGCTC CTCGGCGGGC TCCCGACAGA ATTAATTTCG TCGGCTGGCG GGCAACTGTT 6300 

TTATTCCCGC CCGGTTGTCT CAGCCAATGG CGAGCCAACC GTGAAGCTCT ATACATCAGT 6360 

GGAGAATGCT CAGCAGGATA AGGGTGTTGC TATCCCCCAC GATATCGATC TTGGTGATTC 6420 

GCGTGTGGTC ATTCAGGATT ATGACAACCA GCATGAGCAG GATCGGCCCA CCCCGTCGCC 6480 

TGCGCCATCT CGGCCTTTTT CTGTTCTCCG AGCAAATGAT GTACTTTGGC TGTCCCTCAC 6540 

TGCAGCCGAG TATGACCAGT CCACTTACGG GTCGTCAACT GGCCCGGTTT ATATCTCGGA 6600 

CAGCGTGACT TTGGTGAATG TTGCGACTGG CGCGCAGGCC GTAGCCCGAT CGCTTGACTG 6660 

GTCCAAAGTC ACCCTCGACG GGCGGCCCCT CCCGACTGTT GAGCAATATT CCAAGACATT 6720 

CTTTGTGCTC CCCCTTCGTG GCAAGCTCTC CTTTTGGGAG GCCGGCACAA CAAAAGCAGG 6780 

TTATCCTTAT AATTATAATA CTACTGCTAG TGACCAGATT CTGATTGAAA ATGCTGCCGG 6840 

CCATCGGGTC GCCATTTCAA CCTATACCAC CAGGCTTGGG GCCGGTCCGG TCGCCATTTC 6900 

TGCGGCCGCG GTTTTGGCTC CACGCTCCGC CCTGGCTCTG CTGGAGGATA CTTTTGATTA 6960 



[illegible] TTGATGACTT [illegible] CTGCCCTGAA TGCCGCGCTT TAGGCCTCCA 7020 

GGGTTGTGCT TTCCAGTCAA CTGTCGCTGA GCTCCAGCGC CTTAAAGTTA AGGTGGGTAA 7080 

AACTCGGGAG TTGTAGTTTA TTTGGCTGTG CCCACCTACT TATATCTGCT GATTTCCTTT 7140 

ATTTCCTTTT TCTCGGTCCC GCGCTCCCTG A 7171 



(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1575 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO 

(vi) ORIGINAL SOURCE: 

(C) INDIVIDUAL ISOLATE: T: Mexican strain 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 

GTTGCGTGAG GTGGGTATCT CAGATGCCAT TGTTAATAAT TTCTTCCTTT CGGGTGGCGA 60 

GGTTGGTCAC CAGAGACCAT CGGTCATTCC GCGAGGCAAC CCTGACCGCA ATGTTGACGT 120 

GCTTGCGGCG TTTCCACCTT CATGCCAAAT AAGCGCCTTC CATCAGCTTG CTGAGGAGCT 180 

GGGCCACCGG CCGGCGCCGG TGGCGGCTGT GCTACCTCCC TGCCCTGAGC TTGAGCAGGG 240 

CCTTCTCTAT CTGCCACAGG AGCTAGCCTC CTGTGACAGT GTTGTGACAT TTGAGCTAAC 300 

TGACATTGTG CACTGCCGCA TGGCGGCCCC TAGCCAAAGG AAAGCTGTTT TGTCCACGCT 360 

GGTAGGCCGG TATGGCAGAC GCACAAGGCT TTATGATGCG GGTCACACCG ATGTCCGCGC 420 

CTCCCTTGCG CGCTTTATTC CCACTCTCGG GCGGGTTACT GCCACCACCT GTGAACTCTT 480 

TGAGCTTGTA GAGGCGATGG TGGAGAAGGG CCAAGACGGT TCAGCCGTCC TCGAGTTGGA 540 

TTTGTGCAGC CGAGATGTCT CCCGCATAAC CTTTTTCCAG AAGGATTGTA ACAAGTTCAC 600 

GACCGGCGAG ACAATTGCGC ATGGCAAAGT CGGTCAGGGT ATCTTCCGCT GGAGTAAGAC 660 

CTTTTGTGCC CTGTTTGGCC CCTGGTTCCG TGCGATTGAG AAGGCTATTC TATCCCTTTT 720 

ACCACAAGCT GTGTTCTACG GGGATGCTTA TGACGACTCA GTATTCTCTG CTGCCGTGGC 780 

TGGCGCCAGC CATGCCATGG TGTTTGAAAA TGATTTTTCT GAGTTTGACT CGACTCAGAA 840 

TAACTTTTCC CTAGGTCTTG AGTGCGCCAT TATGGAAGAG TGTGGTATGC CCCAGTGGCT 900 

TGTCAGGTTG TACCATGCCG TCCGGTCGGC GTGGATCCTG CAGGCCCCAA AAGAGTCTTT 960 

GAGAGGGTTC TGGAAGAAGC ATTCTGGTGA GCCGGGCACG TTGCTCTGGA ATACGGTGTG 1020 

GAACATGGCA ATCATTGCCC ATTGCTATGA GTTCCGGGAC CTCCAGGTTG CCGCCTTCAA 1080 

GGGCGACGAC TCGGTCGTCC TCTGTAGTGA ATACCGCCAG AGCCCAGGCG CCGGTTCGCT 1140 

TATAGCAGGC TGTGGTTTGA AGTTGAAGGC TGACTTCCGG CCGATTGGGC TGTATGCCGG 1200 

GGTTGTCGTC GCCCCGGGGC TCGGGGCCCT ACCCGATGTC GTTCGATTCG CCGGACGGCT 1260 

TTCGGAGAAG AACTGGGGGC CTGATCCGGA GCGGGCAGAG CAGCTCCGCC TCGCCGTGCA 1320 

GGATTTCCTC CGTAGGTTAA CGAATGTGGC CCAGATTTGT GTTGAGGTGG TGTCTAGAGT 1380 

TTACGGGGTT TCCCCGGGTC TGGTTCATAA CCTGATAGGC ATGCTCCAGA CTATTGGTGA 1440 

TGGTAAGGCG CATTTTACAG AGTCTGTTAA GCCTATACTT GACCTTACAC ACTCAATTAT 1500 

GCACCGGTCT GAATGAATAA CATGTGGTTT GCTGCGCCCA TGGGTTCGCC ACCATGCGCC 1560 

CTAGGCCTCT TTTGC 1575 



(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 874 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO 

(vi) ORIGINAL SOURCE: 

(C) INDIVIDUAL ISOLATE: Tashkent strain 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 

CGGGCCCCGT ACAGGTCACA ACCTGTGAGT TGTACGAGCT AGTGGAGGCC ATGGTCGAGA 60 

AAGGCCAGGA TGGCTCCGCC GTCCTTGAGC TCGATCTCTG CAACCGTGAC GTGTCCAGGA 120 

TCACCTTTTT CCAGAAAGAT TGCAATAAGT TCACCACGGG AGAGACCATC GCCCATGGTA 180 

AAGTGGGCCA GGGCATTTCG GCCTGGAGTA AGACCTTCTG TGCCCTTTTC GGCCCCTGGT 240 

TCCGTGCTAT TGAGAAGGCT ATTCTGGCCC TGCTCCCTCA GGGTGTGTTT TATGGGGATG 300 

CCTTTGATGA CACCGTCTTC TCGGCGCGTG TGGCCGCAGC AAAGGCGTCC ATGGTGTTTG 360 

AGAATGACTT TTCTGAGTTT GACTCCACCC AGAATAATTT TTCCCTGGGC CTAGAGTGTG 420 

CTATTATGGA GAAGTGTGGG ATGCCGAAGT GGCTCATCCG CTTGTACCAC CTTATAAGGT 480 

CTGCGTGGAT CCTGCAGGCC CCGAAGGAGT CCCTGCGAGG GTGTTGGAAG AAACACTCCG 540 

GTGAGCCCGG CACTCTTCTA TGGAATACTG TCTGGAACAT GGCCGTTATC ACCCATTGTT 600 

ACGATTTCCG CGATTTGCAG GTGGCTGCCT TTAAAGGTGA TGATTCGATA GTGCTTTGCA 660 

GTGAGTACCG TCAGAGTCCA GGGGCTGCTG TCCTGATTGC TGGCTGTGGC TTAAAGCTGA 720 

AGGTGGGTTT CCGTCCGATT GGTTTGTATG CAGGTGTTGT GGTGACCCCC GGCCTTGGCG 780 

CGCTTCCCGA CGTCGTGCGC TTGTCCGGCC GGCTTACTGA GAAGAATTGG GGCCCTGGCC 840 

CTGAGCGGGC GGAGCAGCTC CGCCTTGCTG TGCG 874 

(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 449 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA to mRNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO 

(vi) ORIGINAL SOURCE: 

(C) INDIVIDUAL ISOLATE: Clone 406.4-2 cDNA 

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 2..100 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: 

C GCC AAC CAG CCC GGC CAC TTG GCT CCA CTT GGC GAG ATC AGG CCC 46 
Ala Asn Gln Pro Gly His Leu Ala Pro Leu Gly Glu Ile Arg Pro 
1 5 10 15 

AGC GCC CCT CCG CTG CCT CCC GTC GCC GAC CTG CCA CAG CCG GGG CTG 94 
Ser Ala Pro Pro Leu Pro Pro Val Ala Asp Leu Pro Gln Pro Gly Leu 
20 25 30 

CGG CGC TGACGGCTGT GGCGCCTGCC CATGACACCT CACCCGTCCC GGACGTTGAT 150 
Arg Arg 

TCTCGCGGTG CAATTCTACG CCGCCAGTAT AATTTGTCTA CTTCACCCCT GACATCCTCT 210 

GTGGCCTCTG GCACTAATTT AGTCCTGTAT GCAGCCCCCC TTAATCCGCC TCTGCCGCTG 270 

CAGGACGGTA CTAATACTCA CATTATGGCC ACAGAGGCCT CCAATTATGC ACAGTACCGG 330 

GTTGCCCGCG CTACTATCCG TTACCGGCCC CTAGTGCCTA ATGCAGTTGG AGGCTATGCT 390 

ATATCCATTT CTTTCTGGCC TCAAACAACC ACAACCCCTA CATCTGTTGA CATGAATTC 449 

(2) INFORMATION FOR SEQ ID NO: 14: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14: 

Ala Asn Gln Pro Gly His Leu Ala Pro Leu Gly Glu Ile Arg Pro Ser 
1 5 10 15 

Ala Pro Pro Leu Pro Pro Val Ala Asp Leu Pro Gln Pro Gly Leu Arg 
20 25 30 

Arg 

(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 130 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA to mRNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO 

(vi) ORIGINAL SOURCE: 

(C) INDIVIDUAL ISOLATE: Clone 406.3-2 

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 5..130 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15: 

GGAT ACT TTT GAT TAT CCG GGG CGG GCG CAC ACA TTT GAT GAC TTC TGC 49 
Thr Phe Asp Tyr Pro Gly Arg Ala His Thr Phe Asp Asp Phe Cys 
1 5 10 15 

CCT GAA TGC CGC GCT TTA GGC CTC CAG GGT TGT GCT TTC CAG TCA ACT 97 
Pro Glu Cys Arg Ala Leu Gly Leu Gln Gly Cys Ala Phe Gln Ser Thr 
20 25 30 

GTC GCT GAG CTC CAG CGC CTT AAA GTT AAG GTT 130 
Val Ala Glu Leu Gln Arg Leu Lys Val Lys Val 
35 40 



(2) INFORMATION FOR SEQ ID NO: 16: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 42 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16: 

Thr Phe Asp Tyr Pro Gly Arg Ala His Thr Phe Asp Asp Phe Cys Pro 
1 5 10 15 

Glu Cys Arg Ala Leu Gly Leu Gln Gly Cys Ala Phe Gln Ser Thr Val 
20 25 30 

Ala Glu Leu Gln Arg Leu Lys Val Lys Val 
35 40 

(2) INFORMATION FOR SEQ ID NO: 17: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: peptide 

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO 

(vi) ORIGINAL SOURCE: 

(C) INDIVIDUAL ISOLATE: 406.4-2 epitope - Mexican strain 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17: 

Ala Asn Gln Pro Gly His Leu Ala Pro Leu Gly Glu Ile Arg Pro Ser 
1 5 10 15 

Ala Pro Pro Leu Pro Pro Val Ala Asp Leu Pro Gln Pro Gly Leu Arg 
20 25 30 

Arg 

(2) INFORMATION FOR SEQ ID NO: 18: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: peptide 

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO 

(vi) ORIGINAL SOURCE: 

(C) INDIVIDUAL ISOLATE: 406.4-2 epitope - Burma strain 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18: 

Ala Asn Pro Pro Asp His Ser Ala Pro Leu Gly Val Thr Arg Pro Ser 
1 5 10 15 

Ala Pro Pro Leu Pro His Val Val Asp Leu Pro Gln Leu Gly Pro Arg 
20 25 30 

Arg 

(2) INFORMATION FOR SEQ ID NO: 19: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 42 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: peptide 

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO 

(vi) ORIGINAL SOURCE: 

(C) INDIVIDUAL ISOLATE: 406.3-2 epitope - Mexican strain 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19: 

Thr Phe Asp Tyr Pro Gly Arg Ala His Thr Phe Asp Asp Phe Cys Pro 
1 5 10 15 

Glu Cys Arg Ala Leu Gly Leu Gln Gly Cys Ala Phe Gln Ser Thr Val 
20 25 30 

Ala Glu Leu Gln Arg Leu Lys Val Lys Val 
35 40 

(2) INFORMATION FOR SEQ ID NO: 20: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 42 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: peptide 

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO 

(vi) ORIGINAL SOURCE: 

(C) INDIVIDUAL ISOLATE: 406.3-2 epitope - Burma strain 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20: 

Thr Leu Asp Tyr Pro Ala Arg Ala His Thr Phe Asp Asp Phe Cys Pro 
1 5 10 15 

Glu Cys Arg Pro Leu Gly Leu Gln Gly Cys Ala Phe Gln Ser Thr Val 
20 25 30 

Ala Glu Leu Gln Arg Leu Lys Met Lys Val 
35 40 